Optimization of MFNs for signal-based phrase break prediction
Abstract
The automatic prosodic annotation of large speech corpora is receiving increasing attention because appropriate databases are needed for training prosodic models in speech synthesis and recognition. At the linguistic level, correct phrase and accent marking are essential processing steps. The authors developed a neural-network-based method for signal-based phrase break prediction and tested it on two different speech databases. The structure of the multilayer feed-forward neural network (MFN) was optimized and adapted to the target database and to the specific annotation task. The method is rather data-sensitive: results depend on the human labelers and on small differences between training databases, such as the frequency of occurrence or the strength of phrase breaks. The MFN method can be easily adapted to the characteristics of different databases (long or short phrases, special formats such as dates or web addresses, etc.). When applied to different databases containing phrase markers set by human experts, phrase break recognition rates range from 79% to 97%.
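As a rough illustration of what a multilayer feed-forward network for break/no-break classification looks like, the following minimal sketch runs a forward pass over one word boundary. The abstract does not give the input features, layer sizes, or training procedure, so everything here is an assumption: the three input values stand in for hypothetical acoustic cues, the 3-4-1 topology is arbitrary, and the weights are random and untrained, so the resulting score carries no meaning.

```python
import math
import random


def sigmoid(x):
    """Logistic activation used in every layer of this sketch."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Create one layer: n_out rows, each with n_in weights plus a trailing bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def build_mfn(sizes, seed=0):
    """Build a feed-forward network; `sizes` lists the units per layer (assumed topology)."""
    rng = random.Random(seed)
    return [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]


def forward(layers, features):
    """Propagate a feature vector through all layers and return the output activations."""
    activations = list(features)
    for layer in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row[:-1], activations)) + row[-1])
            for row in layer
        ]
    return activations


# Hypothetical input: three normalized acoustic cues at one word boundary.
mfn = build_mfn([3, 4, 1])
score = forward(mfn, [0.8, 0.3, 0.6])[0]
is_break = score > 0.5  # assumed decision threshold
```

Adapting such a network to a different database, in the sense the abstract describes, would amount to changing `sizes` and retraining the weights on that database's labels; neither step is shown here.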
Similar resources
Traffic Signal Prediction Using Elman Neural Network and Particle Swarm Optimization
Prediction of traffic is very crucial for its management. Because of human involvement in the generation of this phenomenon, traffic signal is normally accompanied by noise and high levels of non-stationarity. Therefore, traffic signal prediction as one of the important subjects of study has attracted researchers’ interests. In this study, a combinatorial approach is proposed for traffic signal...
Learning methods and features for corpus-based phrase break prediction on Thai
This paper presents applications of five famous learning methods for Thai phrase break prediction. Phrase break prediction is particularly important for our Thai text-to-speech synthesizer (TTS), where input Thai text has no word and sentence boundary. The learning methods include a POS sequence model, CART, RIPPER, SLIPPER and neural network. Features proposed for the learning machines can be ...
Incorporating second-order information into two-step major phrase break prediction for Korean
In this paper, we present a new phrase break prediction method that integrates second-order information into general maximum entropy model. The phrase break prediction problem was mapped into a classification problem in our research. The features we used for the prediction of phrase breaks are of several layers such as local features (part-of-speech (POS) tags, a lexicon, lengths of eojeols and...
Chinese prosody phrase break prediction based on maximum entropy model
A maximum entropy based model for prosody phrase break prediction was proposed in this paper, and a comparison was conducted on large corpora between the new model and the decision tree based model which was the mainstream method for prosody phrase break prediction. The contribution of lexical information and influences of different cutoff values were also investigated. It was demonstrated that...
Publication date: 2006